MATH333: Assessment Week 18

Assessment Deadline: Week 19, Tuesday 5pm.

A Bernoulli regression assumes $Y_i \sim \mathrm{Bernoulli}(\mu_i)$, $i = 1, \dots, n$, with free parameters $(\alpha, \beta)$ specified by $\operatorname{logit}(\mu_i) = \alpha + \beta x_i$, where $x$ is a given covariate.
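
As a reminder of the notation, the logit link and its inverse are written out below. This is the standard definition rather than anything specific to this assessment, and $\eta_i$ is a shorthand introduced here for the linear predictor.

\[
\operatorname{logit}(\mu_i) = \log\frac{\mu_i}{1-\mu_i} = \eta_i
\quad\Longleftrightarrow\quad
\mu_i = \frac{\exp(\eta_i)}{1+\exp(\eta_i)},
\qquad \eta_i = \alpha + \beta x_i .
\]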

  1. Explain what the saturated model and the null model are. [2]

  2. For the observed responses $y_1, \dots, y_n$, the fitted means are $\hat{\mu}_1, \dots, \hat{\mu}_n$. Find an expression for the residual deviance of the model in terms of the fitted means. [3]

  3. The data file titanic.dat (accessible from MOODLE) contains information about 1046 passengers on RMS Titanic in 1912 and whether or not they survived the sinking of the ship (survived = 1 or died = 0).

    In R, fit a Bernoulli regression model to the survival of passengers with the linear predictor:

    \[ \operatorname{logit}(\mu_i) = \alpha_0 + \alpha_1 \, \mathrm{age}_i \]

    (a) What is the fitted probability of surviving, $\hat{\mu}_i$, for a 20-year-old passenger? [1]

    (b) Evaluate a 95% confidence interval for the age coefficient. Is age a good explanatory variable for predicting the survival of passengers? You may assume that the regression coefficient estimator has a normal distribution, with mean given by the true value of the regression coefficient and standard deviation approximated by the standard error from the R output. [2]
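
    Under the normal approximation stated in (b), an approximate 95% interval takes the usual Wald form. The sketch below assumes this is the intended construction. Here $\hat{\alpha}_1$ and $\operatorname{se}(\hat{\alpha}_1)$ denote the estimate and standard error of the age coefficient read from the R output, and 1.96 is approximately the 0.975 quantile of the standard normal distribution.

    \[
    \hat{\alpha}_1 \pm 1.96 \, \operatorname{se}(\hat{\alpha}_1)
    \]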

    Consider instead the following Bernoulli regression model where the linear predictor is based on the sex of the passengers:

    \[ \operatorname{logit}(\mu_i) = \beta_1 \, \mathrm{female}_i + \beta_2 \, \mathrm{male}_i \]

    where $\mathrm{female}_i$ is an indicator variable that is 1 if passenger $i$ is female and 0 otherwise, and likewise for the indicator variable $\mathrm{male}_i$. Fit this regression model in R. Note that there is no intercept term.

    (c) What is the fitted probability of surviving for a male passenger? [1]

    (d) Is this model adequate in describing the variability in the data at the 5% significance level? [1]